
NetWorker: How to Debug Backup Operations

Summary: Several options are listed for debugging a failed NetWorker Backup.


Instructions

There are several options available to debug a NetWorker backup failure. This KB covers the debugging options for each function of the backup process that you may want to debug.

Log Files:

The principal logs for debugging backup failures are the policy log files, which are stored in the following location:


Linux: /nsr/logs/policy/policy_name/workflow_name/action_name
Windows: ..\Program Files\EMC NetWorker\nsr\logs\policy\policy_name\workflow_name\action_name


There are workflow log files in the raw format under /nsr/logs/policy/policy_name/workflow_name/jobid.raw and a subdirectory for each action. Each child action of an action has its own log file with the jobid of that child job. When the parent action starts a child action, NetWorker creates a directory for these child action logs.

In the policy log directory, the log files differ in size depending on the debug level that was used during the backup. The raw files are the workflow logs, while the backup_[jobid]_logs directories contain the action logs and child action logs.
 

The main NetWorker log file for all NetWorker operations is the daemon.raw log file. 

This is located in [NetWorker_install_dir]\logs.
 
Linux: /nsr/logs/
Windows: C:\Program Files\EMC NetWorker\nsr\logs

To read this log, you use the nsr_render_log command.
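As a minimal sketch of the rendering step, the helper below passes a raw log to nsr_render_log and writes the readable text to a file. The function name render_nsr_log and the NSR_RENDER_CMD override variable are illustrative and not part of NetWorker. Only the basic form of the command (raw file as the sole argument, rendered text on standard output) is assumed.

```shell
# Render a NetWorker raw log into readable text.
# NSR_RENDER_CMD is a hypothetical override hook (useful for dry runs);
# by default the real nsr_render_log binary is called.
render_nsr_log() {
    # $1 = raw log file, $2 = output text file
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    "${NSR_RENDER_CMD:-nsr_render_log}" "$1" > "$2"
}

# Typical call on a Linux NetWorker server:
# render_nsr_log /nsr/logs/daemon.raw /tmp/daemon.log
```

The same helper can be pointed at the workflow raw files in the policy log directory.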


Further Resources:

503713 : How to use nsr_render_log                                                                      
503582 : NetWorker log files and how to collect for analysis                                                                      
469489 : NetWorker List of Logs to Collect
457094 : Log files and information to collect and provide to support for general NetWorker issues
NetWorker Command Reference Guide
 

 

Save on the NetWorker Client

NetWorker client-based backups use the save process. The save process communicates with the NetWorker server, storage node (where applicable), or target backup device media. You can enable debug on the save process by passing it the -D debug flag, using either the NetWorker Management Console (NMC) or the nsradmin command.

In the NMC, change the 'Backup command' field in the relevant client properties to 'save -D9'.

You can make the same change with the nsradmin command by updating the 'backup command' attribute of the relevant NSR Client resource.

Alternatively, on a Linux system, you can use the printf command to make this nsradmin change in one line:

Example:

printf "show \n . type : NSR Client; name : vm-lego-231; save set : /alice\n update backup command : save -D9\n" | nsradmin -i -
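The one-liner above can be generalized into a small helper, sketched here under the assumption that the same three nsradmin input lines apply to any client resource. The function name nsradmin_debug_input and its arguments are illustrative. It only prints the nsradmin input, so the text can be reviewed before it is piped to nsradmin.

```shell
# Build the nsradmin input that sets 'backup command' to 'save -D<level>'
# for one client resource. Function name and arguments are illustrative.
nsradmin_debug_input() {
    # $1 = NSR Client name, $2 = save set of that resource, $3 = debug level (default 9)
    printf 'show\n'
    printf '. type : NSR Client; name : %s; save set : %s\n' "$1" "$2"
    printf 'update backup command : save -D%s\n' "${3:-9}"
}

# Review the generated input first, then apply it on the NetWorker server:
# nsradmin_debug_input vm-lego-231 /alice
# nsradmin_debug_input vm-lego-231 /alice | nsradmin -i -
```

Once the debug run is finished, set the 'backup command' field back to its previous value so that later backups do not run at debug level.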

 

Further Resources:

NetWorker Command Reference Guide
How to Use NetWorker nsradmin validation checking
Special Uses for the NetWorker nsradmin program Technical Note

 

Workflow Operation on the NetWorker Server 

Use the following command when you need to debug the start of a workflow operation and detailed debug output is needed:

nsrworkflow -D9 -p [policy] -w [workflow]


This logs the workflow job debug output to the raw file in:

/nsr/logs/policy/policy_name/workflow_name/

 

Running the nsrworkflow command starts the job manually but uses the same schedule and level configuration that applies to a scheduled automated backup. Alternatively, use the -a flag to define the nsrworkflow run as an ad hoc backup, which lets you override the backup schedule or level. To specify the backup level that you want (rather than the level set for today's run of the workflow), use the -l option (or -L for virtual machine backups).

Example:

nsrworkflow -p [policy] -w [workflow] -A "'[action]' -l [level]" -a
nsrworkflow -p Mona -w Bokonon_wf -A "'backup' -l full" -a
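The following sketch assembles the ad hoc command line shown above from its parts, which can help avoid quoting mistakes in the -A argument. The function name nsrworkflow_cmdline is illustrative. Combining -D with -A and -a in a single run is an assumption based on the two separate forms in this article, so the helper only prints the command instead of running it.

```shell
# Print (do not run) an nsrworkflow command line for an ad hoc run.
# Adding -D<level> to an ad hoc run is an assumption, so it is optional here.
nsrworkflow_cmdline() {
    # $1 = policy, $2 = workflow, $3 = action, $4 = level, $5 = optional debug level
    debug=""
    if [ -n "${5:-}" ]; then
        debug=" -D$5"
    fi
    printf "nsrworkflow%s -p %s -w %s -A \"'%s' -l %s\" -a\n" \
        "$debug" "$1" "$2" "$3" "$4"
}

# nsrworkflow_cmdline Mona Bokonon_wf backup full
# nsrworkflow_cmdline Mona Bokonon_wf backup full 9
```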

Further Resources:

516616 : How to use the NetWorker nsrworkflow command                                                                    
513030 : How to use the NetWorker nsrpolicy command 
NetWorker 9.1.x Release Notes
NetWorker Command Reference Guide

 

Savefs on the NetWorker Client

The savefs command is used during client-based backups. It is sent to the NetWorker client after the backup is initiated on the NetWorker server. savefs is the process responsible for determining the files and directories to back up for this specific backup run on this client.

You can obtain the exact savefs command that is run on the client side from the raw file in the policy logs on the NetWorker server (/nsr/logs/policy/[policy name]/[workflow name]). Then run this command on the client side, adding the -D9 option.

Assigning Target Media on the NetWorker Server

The assignment of the correct target volume for a backup is managed by the nsrd process on the NetWorker server. To debug this, you must temporarily increase the debug level of the nsrd process on the NetWorker server using the dbgcommand utility.

After debugging is completed, you must turn the debugging off again by resetting the nsrd debug level with dbgcommand.
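As a hedged sketch of the dbgcommand step, the helper below prints the command that raises or resets a daemon's debug level. The original screenshots are not available, so the form 'dbgcommand -n <process> Debug=<level>' is taken from general NetWorker usage rather than from this article, and the function name nsr_debug_cmdline is illustrative. Level 0 is assumed to turn debugging off.

```shell
# Print the dbgcommand invocation that sets the debug level of a NetWorker daemon.
# The 'dbgcommand -n <process> Debug=<level>' form is an assumption based on
# general NetWorker usage; verify it against the Command Reference Guide.
nsr_debug_cmdline() {
    # $1 = daemon name (for example nsrd), $2 = numeric debug level (0 = off)
    case $2 in
        ''|*[!0-9]*)
            echo "debug level must be a non-negative integer" >&2
            return 1
            ;;
        *)
            printf 'dbgcommand -n %s Debug=%s\n' "$1" "$2"
            ;;
    esac
}

# nsr_debug_cmdline nsrd 9    # command to raise the level before reproducing the issue
# nsr_debug_cmdline nsrd 0    # command to turn debugging off afterwards
```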

Further Resources:

336123 : NetWorker Debug

 

Backups Waiting for Writeable Volume

If the NetWorker server cannot find a suitable volume to write to, the backup waits and the server generates an alert. In this case, the job remains in the 'active' state. You can check the state of the job using the nsrpolicy monitor command.

The alert in the NetWorker Management Console gives more details on what type of volume is being sought and on which Storage Node.


Backups unexpectedly stopped responding due to parallelism

If the NetWorker server determines that it cannot continue with the backup because there is no free parallelism slot, the job remains in the 'queued' state.

To debug parallelism, you must increase the debug level of the nsrjobd process on the NetWorker server. The daemon log file then records a large amount of debugging data related to parallelism.

Further Resources:

NetWorker Performance Optimization Planning Guide
Parallelism and Target Sessions

 

Client Direct backup not working as expected

A "Client direct" backup sends data directly from the NetWorker client to the target media without first writing to the NetWorker Storage Node.

You can define in the client properties whether client direct backup should be used for this client instance.

To check whether client direct is working, inspect the logs as in the following examples:

Example:

Log output: Client direct in operation.

Daemon log file on the NetWorker server:

91787 08/01/2014 01:37:35 PM  nsrmmd NSR notice Save-set ID '4091251191' (vm-lego-231:/NetWorker) is using direct file save with Data Domain device 'dd4500-dd.local_onetwoone'.


lsof on the NetWorker client

[root@vm-lego-231 ~]# lsof -i TCP | grep save
save       9831    root    3u  IPv4 111668      0t0  TCP vm-lego-231:23178->vm-lego-121:8985 (ESTABLISHED)
save       9831    root    5u  IPv4 111695      0t0  TCP vm-lego-231:19752->vm-lego-121:9417 (ESTABLISHED)
save       9831    root    7u  IPv4 111720      0t0  TCP vm-lego-231:31095->vm-lego-121:9035 (ESTABLISHED)
save       9831    root    8u  IPv4 111728      0t0  TCP vm-lego-231:12421->vm-lego-121:9653 (ESTABLISHED)
save       9831    root    9u  IPv4 111731      0t0  TCP vm-lego-231:33739->dd4500-dd.local:nfs (ESTABLISHED)
save       9831    root   10u  IPv4 111736      0t0  TCP vm-lego-231:60278->dd4500-dd.local:midnight-tech (ESTABLISHED)


Note: There are open TCP connections from the client both to the NetWorker server and to the Data Domain (DD) system. If you need to know exactly which processes on the NetWorker server the client is connected to, you can cross-check with lsof on the server. The fourth column is the file descriptor being used.

On a Windows system, you can see similar output by using resmon: Start - Run - resmon - Network tab - TCP Connections


Log output:  Backup is not using client direct.

Daemon log file on the NetWorker server:

91797 08/01/2014 01:57:51 PM  nsrmmd NSR severe Unable to perform direct file save with Data Domain device 'ONETWOONE'; setting up traditional save for save-set ID '4024143566' (vm-lego-231:/NetWorker)


Note: Searching the log for the word 'traditional' finds this message quickly. If you need to find out why the backup is not using client direct, start with the list of conditions in the NetWorker Administration Guide that must be met for client direct to work. The most common reasons are that the client has no direct network access to the DD from the NIC it is using, or that name resolution is not working correctly from the client.
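The log search described in the note can be sketched as a small filter. It reads a rendered daemon log on standard input and counts the two message types quoted in this article. The function name client_direct_report is illustrative, and the match strings are taken from the two sample messages above, so other NetWorker versions may word them differently.

```shell
# Read a rendered daemon log on stdin and count save sets that used
# client direct versus those that fell back to a traditional save.
# The match strings come from the sample log lines in this article.
client_direct_report() {
    awk '
        /is using direct file save/   { direct++ }
        /setting up traditional save/ { traditional++ }
        END { printf "direct=%d traditional=%d\n", direct, traditional }
    '
}

# Example, assuming the daemon log was rendered to /tmp/daemon.log:
# client_direct_report < /tmp/daemon.log
```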

lsof on the NetWorker client:

[root@vm-lego-231 ~]# lsof -i TCP | grep save
save      10114    root    3u  IPv4 123335      0t0  TCP vm-lego-231:46461->vm-lego-121:8985 (ESTABLISHED)
save      10114    root    5u  IPv4 123369      0t0  TCP vm-lego-231:12593->vm-lego-121:9417 (ESTABLISHED)
save      10114    root    7u  IPv4 123392      0t0  TCP vm-lego-231:63952->vm-lego-121:9035 (ESTABLISHED)
save      10114    root    8u  IPv4 123400      0t0  TCP vm-lego-231:29597->vm-lego-121:9653 (ESTABLISHED)


Note:  Only TCP connections to the NetWorker Server (which is also the Storage Node in this example) are open here.  There is no TCP connection open to the DD.  All the data is going to the Storage Node.
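To compare the two lsof listings more quickly, the sketch below reduces lsof output to the distinct remote hosts that the save process is connected to. If the Data Domain host appears in the result, the client has a direct connection to it. The function name save_remote_hosts is illustrative, and the parsing assumes hostnames or IPv4 addresses in the NAME column, as in the samples above.

```shell
# Read `lsof -i TCP` output on stdin and print each distinct remote host
# that a save process has an established connection to.
# Assumes hostnames or IPv4 addresses (no bracketed IPv6) in the NAME column.
save_remote_hosts() {
    awk '$1 == "save" && /ESTABLISHED/ {
        split($9, ends, "->")         # NAME column: local:port->remote:port
        sub(/:[^:]*$/, "", ends[2])   # drop the trailing :port or :service
        print ends[2]
    }' | sort -u
}

# On the NetWorker client while the backup is running:
# lsof -i TCP | save_remote_hosts
```

For the first listing in this section, the result contains both vm-lego-121 and dd4500-dd.local. For the second listing, it contains only vm-lego-121.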

Further Resources:

NetWorker Performance Optimization Planning Guide


Parallel Save Stream Backups

To debug Parallel Save Stream (PSS) backups, ensure that the 'parallel save stream' property is ticked in the client resource in the NetWorker Management Console. Modify the save command to enable debug as described in the 'Save on the NetWorker Client' section above. Also, create an empty file called 'mbsdopen' in ../nsr/debug. This provides extra debug logging both on the client in /nsr/tmp and in the policy logs on the NetWorker server (see the 'Log Files' section above).
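Creating and later removing the 'mbsdopen' file can be sketched as below. The function name pss_debug and the NSR_DEBUG_DIR variable are illustrative. The default of /nsr/debug assumes a Linux client with the standard /nsr location, so adjust the directory for other installations.

```shell
# Create or remove the empty 'mbsdopen' flag file in the NetWorker debug directory.
# NSR_DEBUG_DIR is an illustrative variable; /nsr/debug is assumed for Linux.
pss_debug() {
    dir=${NSR_DEBUG_DIR:-/nsr/debug}
    case $1 in
        on)  mkdir -p "$dir" && : > "$dir/mbsdopen" ;;   # create the empty flag file
        off) rm -f "$dir/mbsdopen" ;;                    # remove it after debugging
        *)   echo "usage: pss_debug on|off" >&2; return 1 ;;
    esac
}

# pss_debug on     # before reproducing the PSS backup problem
# pss_debug off    # once the logs have been collected
```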


 

Further Resources:

How to Troubleshoot NetWorker Parallel Save Stream backups
NetWorker Performance Optimization Planning Guide

 

 

NetWorker Storage Node nsrmmd process not working as expected when writing to the target media

 

You can increase the debug level of the nsrmmd processes using dbgcommand (described in the 'Assigning Target Media on the NetWorker Server' section above). You can either increase the debug level of all the nsrmmd processes or use operating system tools to identify which nsrmmd process is active and increase the debug level of that process only.
 

Further Resources:

479665 : Triage Article: Troubleshooting Tape Library Problems in NetWorker
NetWorker Data Domain Boost Integration Guide


Affected Products

NetWorker

Products

NetWorker, NetWorker Series